German Alcohol Language Corpus - the Question of Dialect
Authors
Bavarian Archive for Speech Signals (BAS), Ludwig-Maximilians-Universität München, Schellingstr. 3, 80799 München, Germany schiel|[email protected]
Abstract
Speech uttered under the influence of alcohol is known to deviate from the speech of the same person when sober. This is an important feature in forensic investigations and could also be used to detect intoxication in the automotive environment. Aside from acoustic-phonetic features and speech content, which have already been studied by others, in this contribution we address the question of whether speakers use dialectal variation or dialect words more frequently when intoxicated than when sober. We analyzed 300,000 recorded word tokens in read and spontaneous speech uttered by 162 female and male speakers within the German Alcohol Language Corpus. We found that, contrary to our expectations, the frequency of dialectal forms decreases significantly when speakers are under the influence. We explain this effect with a compensatory over-shoot mechanism: speakers are aware of their intoxication and that they are being monitored. In forensic analysis of speech, this “awareness factor” must be taken into account.
Similar resources
Globalization, Standardization, and Dialect Leveling in Iran
This paper is an attempt to shed light on the effects of modernization, urbanization, monolingual educational system, and mass media as well as the process of globalization on dialect leveling among Persian dialects. In so doing, the first part of the paper elaborates on the relationship between globalization and sociolinguistics, and on the concept of standardization. Also, it discusses some ...
Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging
Swiss German is a dialect continuum whose dialects are very different from Standard German, the official language of the German part of Switzerland. However, when dealing with Swiss German in natural language processing, the detour through Standard German is usually taken. As writing in Swiss German has become more and more popular in recent years, we would like to provide data to serve as a steppin...
A Resource for Natural Language Processing of Swiss German Dialects
Since there are only a few resources for Swiss German dialects, we compiled a corpus of 115,000 tokens, manually annotated with PoS tags. The goal is to provide a basic data set for developing NLP applications for Swiss German. We extended the original corpus and improved its annotation consistency. Furthermore, we trained dialect-specific PoS-tagging models and implemented a baseline system for...
German broadcast news transcription
We describe a newly created broadcast news (BN) corpus based on programs of seven different German and Austrian TV stations and the development of a German BN transcription system based on this corpus. We report on a series of experiments addressing the fact that German is less suited than English for word-based trigram language models. Furthermore, we investigate various phoneme sets and exami...
A Description of Derivational Affixes in Sarhaddi Balochi of Granchin
The Sarhaddi Balochi dialect, a language variety of Western (Rakhshani) Balochi, employs derivation through affixation as one of its word formation processes. The purpose of this article is to present a synchronic description of the way(s) different derivational affixes function in making complex words in Sarhaddi Balochi as spoken in the Granchin district, located about 35 km to the southeast of Kha...